Effect of nonnegativity on estimation errors 
in one-qubit state tomography with finite data 
We analyze the behavior of estimation errors evaluated by two loss functions, the Hilbert-Schmidt 
distance and infidelity, in one-qubit state tomography with finite data. We show numerically that 
there can be a large gap between the estimation errors and those predicted by an asymptotic analysis. 
The origin of this discrepancy is the existence of the boundary in the state space imposed by the 
requirement that density matrices be nonnegative (positive semidefinite). We derive an explicit form 
of a function reproducing the behavior of the estimation errors with high accuracy by introducing 
two approximations: a Gaussian approximation of the multinomial distributions of outcomes, and 
linearizing the boundary. This function gives us an intuition for the behavior of the expected losses 
for finite data sets. We show that this function can be used to determine the amount of data 
necessary for the estimation to be treated reliably with the asymptotic theory. We give an explicit 
expression for this amount, which exhibits strong sensitivity to the true quantum state as well as 
the choice of measurement. 
I. INTRODUCTION 

Quantum tomography has become a standard mea- 
surement technique in quantum physics. It is especially 
important in the field of quantum information as it is 
used for the confirmation of successful experimental im- 
plementation of quantum protocols. For example, it can 
be used to confirm that the quantum states required in 
a quantum information protocol are sufficiently close 
to their theoretical targets. In practice, experimental 
data obtained from tomographic measurements are 
used to assign a mathematical description to an unknown 
quantum state or operation, called an estimate. Statis- 
tically, this is a constrained multi-parameter estimation 
problem - the quantum estimation problem - where we 
assume we are given a finite number of identical copies 
of a quantum state or process, we perform measurements 
whose mathematical description is assumed to be known, 
and from the outcome statistics we make our estimate. 
Due to the probabilistic behavior of the measurement 
outcomes and the finiteness of the number of measure- 
ment trials, there always exist statistical errors in any 
quantum estimate. The size of the error depends on the 
choice of the measurements, known as the experimental 
design, and the estimation algorithm, known as the esti- 
mator. A standard combination in quantum information 
is that of quantum tomography and maximum likelihood 
estimation [?]. In order to compare estimation schemes, 
it is therefore important to evaluate precisely the size of 
the estimation error for a given combination of experimental 
design and estimator. 

For evaluating the size of the estimation error, we in- 
troduce a distance-like function, called a loss function, 
between the estimate and the true operator. One way 
to evaluate estimation errors using a loss function is an 
expected loss, which is the statistical expectation value 
of the loss function over all possible data sets. In quan- 
tum information experiments, the infidelity (one minus 
the fidelity) and the trace distance are often used as loss 
functions for state estimation. These evaluations are of- 
ten performed in the theoretical limit of infinite data, 
called the asymptotic regime. The asymptotic behavior 
of these expected losses for this combination has been 
studied very well [2, 3]. Using the asymptotic theory 
of parameter estimation, we can show that for a suffi- 
ciently large number of measurement trials, N, there is 
a lower bound of the expected losses, called the Cramer- 
Rao bound. It is known that a maximum likelihood es- 
timator achieves the Cramer-Rao bound asymptotically, 
and that those expected losses decrease as $O(1/N)$. 

In practice of course, no experiment produces infinitely 
many data, and there are problems in applying the 
asymptotic theory of expected losses to finite data sets. 
First of all, the Cramer-Rao inequality holds only for a 
specific class of estimators, namely those that are unbi- 
ased. A maximum likelihood estimator is asymptotically 
unbiased, but is not unbiased for finite $N$, so the expected 
losses can be smaller than the bound for finite 
N. Especially when the purity of the true density ma- 
trix becomes high, the bias becomes larger. This is due 
to the boundary in the parameter space imposed by the 
condition that density matrices be positive semidefinite, 
and the expected losses can deviate significantly from the 
asymptotic behavior [?, ?]. A natural question is then to 
ask at what value of N the expected losses begin to be- 
have asymptotically. If N is large enough for the effect of 
the bias to be negligible, we can safely apply the asymp- 
totic theory for evaluating the estimation error in an ex- 
periment. However, in general, determining the effects of 
the bias is a difficult problem. 

In this paper, we analyze the effect of the bias caused 
by the parameter space boundary in one-qubit state tomography 
using a maximum likelihood estimator. In section II 
we briefly review quantum state tomography and 
the asymptotic theory. In section III we analyze the 
boundary effect theoretically. Applying ideas from classical 
statistical estimation theory, we derive an approximate 
form of the expected losses for finite $N$. In section IV 
we analyze the boundary effect numerically, giving 
the results of our pseudo-random numerical experiments. 
These indicate that the function we derived reproduces 
the behavior of the expected losses for finite N more pre- 
cisely than the Cramer-Rao bound. This makes it pos- 
sible to predict the point at which the behavior of the 
expected infidelity becomes effectively asymptotic. We 
conclude in section V. 

II. QUANTUM STATE TOMOGRAPHY AND 
ASYMPTOTIC ESTIMATION THEORY 

In this section, we give a brief review of known results 
in quantum state tomography and asymptotic estimation 
theory. The purpose of quantum state tomography is to 
identify the density matrix characterizing the state of a 
quantum system of interest. Here we only consider states 
of a single qubit. Let $\mathcal{H}$ be the 2-dimensional Hilbert 
space and $\mathcal{S}(\mathbb{C}^2)$ be the set of all positive semidefinite 
density matrices acting on $\mathcal{H}$. Such a density matrix $\rho$ 
can be parametrized as 

$$\rho(s) = \frac{1}{2}\left(\mathbb{1} + s \cdot \sigma\right), \qquad (1)$$

where $\mathbb{1}$ is the identity matrix on $\mathbb{C}^2$, $\sigma = (\sigma_1, \sigma_2, \sigma_3)^T$ 
is the vector of Pauli matrices, and $s \in \mathbb{R}^3$, $\|s\| \le 1$, 
is called the Bloch vector. Let us define the parameter 
space $S := \{ s \mid \rho(s) \in \mathcal{S}(\mathbb{C}^2) \}$. Identifying the true density 
matrix $\rho \in \mathcal{S}(\mathbb{C}^2)$ is equivalent to identifying the 
true parameter $s \in S$. Let $\Pi = \{\Pi_x\}_{x \in \mathcal{X}}$ denote the 
POVM characterizing the measurement apparatus used 
in the tomographic experiment, where X is the set of 
measurement outcomes. Like a density matrix, a POVM 
can be parametrized as 

$$\Pi_x = v_x \mathbb{1} + w_x \cdot \sigma, \qquad (2)$$

where $(v_x, w_x) \in \mathbb{R}^4$. When the true density matrix is 
$\rho(s)$, Born's rule tells us that the probability distribution 
describing the tomographic experiment is given by 

$$\begin{aligned}
p(x|s) &= \mathrm{Tr}[\rho(s)\Pi_x] && (3) \\
&= v_x + w_x \cdot s, && (4)
\end{aligned}$$
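
As a concrete illustration of Eq. (4), the following Python sketch evaluates the outcome probabilities of a POVM given in the $(v_x, w_x)$ parametrization. The names `outcome_probabilities`, `make_xyz_povm` and `XYZ_POVM` are illustrative assumptions, not part of the paper; the example POVM is the six-outcome measurement along the three Pauli axes that is used for the simulations in Sec. IV, for which $v_x = 1/6$ and $w_x = \pm e_\alpha/6$.

```python
def outcome_probabilities(povm, s):
    """Return [p(x|s)] with p(x|s) = v_x + w_x . s for each POVM element (v_x, w_x)."""
    return [v + sum(w_a * s_a for w_a, s_a in zip(w, s)) for v, w in povm]


def make_xyz_povm():
    """Six-outcome POVM {(1/3)|a+-><a+-|}: v_x = 1/6, w_x = +-e_a/6 (assumed example)."""
    povm = []
    for axis in range(3):
        for sign in (1.0, -1.0):
            w = [0.0, 0.0, 0.0]
            w[axis] = sign / 6.0
            povm.append((1.0 / 6.0, tuple(w)))
    return povm


XYZ_POVM = make_xyz_povm()
# Bloch vector of length 0.9 along the third measurement axis.
probs = outcome_probabilities(XYZ_POVM, (0.0, 0.0, 0.9))
```

The six probabilities sum to one, and only the two outcomes on the third axis depend on the Bloch vector in this example.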



where Tr denotes the trace operation with respect to $\mathbb{C}^2$. 
We assume that in the experiment we prepare identical 
copies of an unknown state $\rho(s)$. We perform $N$ measurement 
trials and obtain a data set $x^N = (x_1, \ldots, x_N)$, 
where $x_i \in \mathcal{X}$ is the outcome observed in the $i$-th trial. 
Let $N_x$ denote the number of times that outcome $x$ occurs 
in $x^N$; then $f_N(x) := N_x/N$ is the relative frequency of $x$ 
for the data set $x^N$. In the relative frequency interpretation 
of probability, one has that in the limit of $N \to \infty$, 
$f_N(x)$ converges to the true probability $p(x|s)$. A POVM 
is called informationally complete if $\mathrm{Tr}[\rho\Pi_x] = \mathrm{Tr}[\rho'\Pi_x]$ 
has a unique solution $\rho'$ for arbitrary $\rho \in \mathcal{S}(\mathcal{H})$ [?]. This 
condition is equivalent to that of the POVM $\Pi$ being a 
basis for the set of all Hermitian matrices on $\mathcal{H}$. For finite 
$N$, the relative frequency and true probability are 
generally not the same, i.e., there is unavoidable statistical 
error, and we need to choose an estimation procedure 
that takes the experimental result $x^N$ to a density matrix, 
that is, we need an estimator. 

It is natural to consider a linear estimator, which demands 
that we find a $2 \times 2$ matrix $\rho^{\mathrm{l}}_N$ satisfying 

$$\mathrm{Tr}[\rho^{\mathrm{l}}_N \Pi_x] = f_N(x), \quad x \in \mathcal{X}. \qquad (5)$$

However, Eq. (5) does not always have a solution, and 
even when it does, although the solution is Hermitian 
and normalized, it is not guaranteed that $\rho^{\mathrm{l}}_N$ is positive 
semidefinite. Let us explore this point further in the 
one-qubit case. The positive semidefinite condition restricts 
the physically permitted parameter region to the ball 
$B := \{ s \in \mathbb{R}^3 \mid \|s\| \le 1 \}$. On the other hand, a linear estimate 
is a random variable that can take values anywhere 
in the cube $C := \{ s \in \mathbb{R}^3 \mid -1 \le s_\alpha \le 1, \alpha = 1, 2, 3 \}$. 
There is therefore a 'gap' between $B$ and $C$, consisting of 
unphysical linear estimates. When the true Bloch parameter 
$s$ is in the interior of $B$ and $N$ becomes sufficiently 
large, the probability that linear estimates are out of $B$ 
becomes negligibly small. However, when the Bloch vector 
is on the boundary of $B$, or when $N$ is not sufficiently 
large, the effect of unphysical linear estimates cannot be 
ignored. A maximum likelihood estimator $\rho^{\mathrm{ml}}$ is one way 
to address these problems. The estimated density matrix 
and the Bloch vector are defined as 

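$$\rho^{\mathrm{ml}}_N := \operatorname{argmax}_{\rho \in \mathcal{S}(\mathcal{H})} \prod_{i=1}^{N} \mathrm{Tr}[\rho\,\Pi_{x_i}], \qquad (6)$$
$$s^{\mathrm{ml}}_N := \operatorname{argmax}_{s \in S} \prod_{i=1}^{N} \mathrm{Tr}[\rho(s)\,\Pi_{x_i}]. \qquad (7)$$

It can be shown that when $\rho^{\mathrm{l}}_N \in \mathcal{S}(\mathcal{H})$, $\rho^{\mathrm{l}}_N = \rho^{\mathrm{ml}}_N$ holds. 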
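
To make the gap between the ball $B$ and the cube $C$ concrete, the sketch below computes a linear estimate from outcome counts and checks whether it is physical. It assumes, beyond what is stated above, that three projective measurements along the Pauli axes are each performed the same fixed number of times, so that each component of the estimate is the difference of the two relative frequencies on that axis. The function names and the example counts are hypothetical.

```python
import math


def linear_estimate(counts):
    """counts[a] = (n_plus, n_minus) on axis a; s_a = (n_plus - n_minus) / (n_plus + n_minus)."""
    return tuple((n_plus - n_minus) / (n_plus + n_minus) for n_plus, n_minus in counts)


def is_physical(s, tol=1e-12):
    """A Bloch vector gives a positive semidefinite density matrix iff ||s|| <= 1."""
    return math.sqrt(sum(c * c for c in s)) <= 1.0 + tol


# Hypothetical data: 33 trials on each of the three axes.
s_lin = linear_estimate([(30, 3), (28, 5), (31, 2)])
inside_cube = all(abs(c) <= 1.0 for c in s_lin)
inside_ball = is_physical(s_lin)
```

For these counts the estimate lies inside the cube but outside the ball, so the corresponding matrix has a negative eigenvalue and a constrained estimator is needed.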

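In order to evaluate the precision of estimates, we introduce 
a loss function. A loss function $\Delta$ is a map 
from $\mathcal{S}(\mathcal{H}) \times \mathcal{S}(\mathcal{H})$ to $\mathbb{R}$ such that (i) $\forall \rho, \sigma \in \mathcal{S}(\mathcal{H})$, 
$\Delta(\rho, \sigma) \ge 0$, and (ii) $\forall \rho \in \mathcal{S}(\mathcal{H})$, $\Delta(\rho, \rho) = 0$. For example, 
the trace distance and the infidelity (one minus the 
fidelity) are loss functions for density matrices. For our 
loss functions, we use both the squared Hilbert-Schmidt 
distance $\Delta^{\mathrm{HS}}$ and the infidelity $\Delta^{\mathrm{IF}}$ [?], defined as 

$$\begin{aligned}
\Delta^{\mathrm{HS}}(s, s') &:= \frac{1}{2}\mathrm{Tr}\left[\left(\rho(s) - \rho(s')\right)^2\right] && (8) \\
&= \frac{1}{4}\|s - s'\|^2, && (9) \\
\Delta^{\mathrm{IF}}(s, s') &:= 1 - \mathrm{Tr}\left[\sqrt{\sqrt{\rho(s)}\,\rho(s')\sqrt{\rho(s)}}\right]^2 && (10) \\
&= \frac{1}{2}\left(1 - s \cdot s' - \sqrt{1 - \|s\|^2}\sqrt{1 - \|s'\|^2}\right). && (11)
\end{aligned}$$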



The Hilbert-Schmidt distance is a normalized Euclidean 
distance in the parameter space, and the infidelity is 
a conventional loss function used in experiments. We 
note that the Hilbert-Schmidt distance coincides with the 
trace distance in one-qubit systems, but it does not in 
general. 
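
Both loss functions are simple to evaluate directly from Bloch vectors. The sketch below assumes the standard one-qubit closed forms, $\frac{1}{4}\|s-s'\|^2$ for the squared Hilbert-Schmidt distance (with the normalization $\frac{1}{2}\mathrm{Tr}[(\rho-\rho')^2]$) and $\frac{1}{2}(1 - s\cdot s' - \sqrt{1-\|s\|^2}\sqrt{1-\|s'\|^2})$ for the infidelity; the function names are illustrative.

```python
import math


def hs_loss(s, s_prime):
    """Squared Hilbert-Schmidt distance in Bloch form: (1/4) ||s - s'||^2."""
    return 0.25 * sum((a - b) ** 2 for a, b in zip(s, s_prime))


def infidelity(s, s_prime):
    """One-qubit infidelity: (1/2)(1 - s.s' - sqrt(1 - |s|^2) sqrt(1 - |s'|^2))."""
    dot = sum(a * b for a, b in zip(s, s_prime))
    # max(0, .) guards against tiny negative values from rounding when a state is pure.
    q = max(0.0, 1.0 - sum(a * a for a in s))
    q_prime = max(0.0, 1.0 - sum(b * b for b in s_prime))
    return 0.5 * (1.0 - dot - math.sqrt(q * q_prime))
```

For two orthogonal pure states both functions return 1, and for a pure state compared with the completely mixed state the infidelity is 1/2.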

The outcomes of quantum measurements are random 
variables, and the value of the loss function between an 
estimate and the true density matrix is also a random 
variable. Thus, in order to evaluate the precision of a 
general estimator $\rho^{\mathrm{est}}$ (not the estimate) for the true density 
matrix, we use the statistical expectation value of the 
loss function, called an expected loss (sometimes called a 
risk function) [?]. The explicit form is given by 

$$\bar{\Delta}_N(\rho^{\mathrm{est}}|\rho) := \sum_{x^N \in \mathcal{X}^N} p(x^N|\rho)\,\Delta\left(\rho^{\mathrm{est}}_N(x^N), \rho\right). \qquad (12)$$

The value of the expected loss depends on the choice of 
the estimator as well as the true density matrix. The 
latter is of course unknown in an experiment, and one 
way to eliminate its dependence is to average over all 
possible true states 

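$$\bar{\Delta}_N(\rho^{\mathrm{est}}) := \int_{\rho \in \mathcal{S}(\mathcal{H})} d\mu(\rho)\,\bar{\Delta}_N(\rho^{\mathrm{est}}|\rho), \qquad (13)$$

where $\mu$ is a probability measure on $\mathcal{S}(\mathcal{H})$. The purpose of 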
this paper is to clarify the behavior of expected losses for 
true states close to or on the boundary of B, so we focus 
not on average but pointwise expected losses for those 
states. 

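Let us assume that $\|s\| < 1$. For any unbiased estimator 
$s^{\mathrm{est}}$ and any positive semidefinite matrix $H_s$, the 
inequality 

$$\bar{\Delta}_N(s^{\mathrm{est}}|s) \ge \frac{1}{N}\mathrm{tr}\left[H_s F_s^{-1}\right] \qquad (14)$$

holds, where 

$$\begin{aligned}
F_s &:= \sum_{x \in \mathcal{X}} \frac{\nabla_s p(x|s)\,\nabla_s^T p(x|s)}{p(x|s)} && (15) \\
&= \sum_{x \in \mathcal{X}} \frac{w_x w_x^T}{v_x + w_x \cdot s} && (16)
\end{aligned}$$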



is called the Fisher matrix and tr denotes the trace operation 
with respect to the parameter space $\mathbb{R}^3$. Equation 
(14) is called the Cramer-Rao inequality, and it holds 
not only for one-qubit state tomography, but also for arbitrary 
finite dimensional parameter estimation problems 
under some regularity conditions [12]. The matrix $F_s$ is 
a $3 \times 3$ positive semidefinite matrix for $s \in \mathbb{R}^3$. It is 
known that a maximum likelihood estimator asymptotically 
achieves the equality of Eq. (14) [12]. From the explicit 
formulas for the squared Hilbert-Schmidt distance 
and infidelity in Eqs. (9) and (11), we have 



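$$\Delta^{\mathrm{HS}}(s, s') = \frac{1}{4}(s' - s) \cdot I\,(s' - s), \qquad (17)$$

$$\Delta^{\mathrm{IF}}(s, s') = \frac{1}{4}(s' - s) \cdot \left(I + \frac{s s^T}{1 - \|s\|^2}\right)(s' - s) + O(\|s' - s\|^3), \qquad (18)$$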



where $I$ is the identity matrix on $\mathbb{R}^3$. Therefore when 
we use the Hilbert-Schmidt distance as our loss function, 
we substitute $H_s$ in Eq. (14) by $H^{\mathrm{HS}}_s := \frac{1}{4} I$. On the 
other hand, when our loss function is the infidelity, we 
must use $H^{\mathrm{IF}}_s := \frac{1}{4}\left(I + \frac{s s^T}{1 - \|s\|^2}\right)$. These two matrices 
$H^{\mathrm{HS}}_s$ and $H^{\mathrm{IF}}_s$ are half of the Hesse matrices for $\Delta^{\mathrm{HS}}$ and 
$\Delta^{\mathrm{IF}}$, respectively. 
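
The Cramer-Rao bound can be evaluated numerically for any POVM in the $(v_x, w_x)$ parametrization. The sketch below is an illustration under stated assumptions rather than the authors' code: it uses $\nabla_s p(x|s) = w_x$ to build the Fisher matrix, takes the weight matrices to be $\frac{1}{4}I$ for the squared Hilbert-Schmidt distance and $\frac{1}{4}(I + s s^T/(1-\|s\|^2))$ for the infidelity, and requires a true state for which every outcome probability is nonzero. All function names are hypothetical.

```python
def fisher_matrix(povm, s):
    """F_s = sum_x w_x w_x^T / p(x|s), with p(x|s) = v_x + w_x . s (all p(x|s) > 0 assumed)."""
    F = [[0.0] * 3 for _ in range(3)]
    for v, w in povm:
        p = v + sum(w_a * s_a for w_a, s_a in zip(w, s))
        for a in range(3):
            for b in range(3):
                F[a][b] += w[a] * w[b] / p
    return F


def inverse_3x3(m):
    """Inverse of a 3x3 matrix via the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def cramer_rao_bound(H, F, n_trials):
    """Return tr[H F^{-1}] / N."""
    F_inv = inverse_3x3(F)
    return sum(H[a][b] * F_inv[b][a] for a in range(3) for b in range(3)) / n_trials


def weight_hs():
    """Weight matrix for the squared Hilbert-Schmidt distance: I/4."""
    return [[0.25 if a == b else 0.0 for b in range(3)] for a in range(3)]


def weight_if(s):
    """Weight matrix for the infidelity: (I + s s^T / (1 - |s|^2)) / 4, for |s| < 1."""
    q = 1.0 - sum(c * c for c in s)
    return [[0.25 * ((1.0 if a == b else 0.0) + s[a] * s[b] / q) for b in range(3)]
            for a in range(3)]


# Assumed example: six-outcome measurement along the Pauli axes, v_x = 1/6, w_x = +-e_a/6.
povm = [(1.0 / 6.0, tuple(sign / 6.0 if b == a else 0.0 for b in range(3)))
        for a in range(3) for sign in (1.0, -1.0)]
s_true = (0.0, 0.0, 0.9)
F = fisher_matrix(povm, s_true)
bound_hs = cramer_rao_bound(weight_hs(), F, 1000)
bound_if = cramer_rao_bound(weight_if(s_true), F, 1000)
```

For this example the two bounds evaluate to 1.6425/N and 2.25/N with N = 1000.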

The Cramer-Rao inequality is often used to evaluate 
the estimation errors of a maximum likelihood estimator, 
but there are problems applying the bound to evaluating 
the expected losses for finite data sets. The inequality 
holds for unbiased estimators, which maximum likelihood 
estimators are asymptotically. However, they are not 
unbiased for finite $N$, because of the existence of the 
boundary in the parameter space. It has been shown 
numerically that for values of $N$ in typical experiments 
the Cramer-Rao bound cannot be applied [?]. Hence, 
we are motivated to investigate the behavior of expected 
losses in parameter spaces with boundaries for finite data 
sets. We undertake this investigation for the one-qubit case in the 
next section. 



III. THEORETICAL ANALYSIS 

In this section, we derive a function which approxi- 
mates the expected losses of the squared Hilbert-Schmidt 
distance and infidelity for finite data sets. 



A. Two approximations 

In general, the explicit form of expected losses with fi- 
nite data sets is extremely complicated. In this paper, we 
try to derive not the exact form but a simpler function 
which reproduces the behavior of the true function accu- 
rately enough to help us understand the boundary effect. 
In order to accomplish this, we introduce two approximations. 
First, we approximate the multinomial distribution 
generated by successive trials by a Gaussian distribution. 
Second, we approximate the spherical boundary 
by a plane tangent to it. 

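From the central limit theorem, we can readily prove 
that the distribution of a linear estimator $s^{\mathrm{l}}_N$ converges 
to a Gaussian distribution with mean $s$ and covariance 
matrix $F_s^{-1}/N$. For finite $N$, we approximate the true probability 
distribution by the Gaussian distribution 

$$p_G(s^{\mathrm{l}}_N|s) := \frac{N^{3/2}\sqrt{\det F_s}}{(2\pi)^{3/2}} \exp\left[-\frac{N}{2}(s^{\mathrm{l}}_N - s) \cdot F_s (s^{\mathrm{l}}_N - s)\right]. \qquad (19)$$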
We will refer to this as the Gaussian distribution approximation 
(GDA). Because the approximation of the 
multinomial distribution by the GDA becomes better as 
each outcome probability grows sufficiently larger than 
0, the expected losses under the GDA should be closer to 
the true expected losses the farther the true Bloch vec- 
tor is from alignment with the axes in the Bloch sphere 
defined by the measurement. 

For a one-qubit system, the boundary between the 
physical and unphysical regions of the state space is a 
sphere with unit radius. Despite its simplicity, it is dif- 
ficult to derive the explicit formula of a maximum likeli- 
hood estimator even in this case. Indeed, this is a major 
contributor to the general complexity of the expected loss 
behavior in quantum tomography. We therefore choose 
the simplest possible way to approximate the boundary, 
namely by replacing it with a plane in the state space. 
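Suppose that the true Bloch vector is $s \in B$. The boundary 
of the Bloch ball, $\partial B$, is represented as 

$$\partial B := \{ s' \in \mathbb{R}^3 \mid \|s'\| = 1 \}. \qquad (20)$$

We approximate this by the tangent plane to the sphere 
at the point $e_s := s/\|s\|$, represented as 

$$\partial D_s := \{ s' \in \mathbb{R}^3 \mid s \cdot (s' - e_s) = 0 \}, \qquad (21)$$

and so the approximated parameter space is represented 
as 

$$D_s := \{ s' \in \mathbb{R}^3 \mid s \cdot (s' - e_s) \le 0 \}. \qquad (22)$$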



We will refer to this as the linear boundary approximation 
(LBA). The LBA is a specific case of tangent 
cone methods in statistical estimation theory, which have 
been developed and used for analyzing models with constraints 
[13, 14]. It is known that the distribution of a maximum 
likelihood estimator in a constrained parameter 
estimation problem converges to the Gaussian distribution 
with a boundary approximated by a tangent cone 
[14]. Therefore it is guaranteed that the expected losses 
approximated by the GDA and LBA converge to their 
true values in the limit of infinite data. 



B. Approximated maximum likelihood estimator 

In [13], it is proved that the distribution of a maximum 
likelihood estimator in a constrained parameter 
estimation problem converges to the distribution of the 
following vector 

$$\tilde{s}^{\mathrm{ml}}_N := \operatorname{argmin}_{s' \in D_s} (s^{\mathrm{l}}_N - s') \cdot F_s (s^{\mathrm{l}}_N - s'). \qquad (23)$$



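By using the Lagrange multiplier method, we can derive 
the approximated maximum likelihood estimates as 

$$\tilde{s}^{\mathrm{ml}}_N = \begin{cases} s^{\mathrm{l}}_N & \text{if } s^{\mathrm{l}}_N \in D_s, \\ s^{\mathrm{l}}_N - \dfrac{e_s \cdot s^{\mathrm{l}}_N - 1}{e_s \cdot F_s^{-1} e_s}\,F_s^{-1} e_s & \text{otherwise}. \end{cases} \qquad (24)$$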



We note that $\tilde{s}^{\mathrm{ml}}_N$ depends on the true parameter $s$, and 
so by definition it is not an estimator - it is a vector 
introduced for the purpose of approximating expected 
losses of a maximum likelihood estimator. Intuitively, 
it takes the value of the linear estimate if that estimate 
is physical, and if it is unphysical a correction vector is 
added to bring it back within the physical region. 
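
The correction step can be sketched in a few lines of Python. The closed form used here is what the Lagrange multiplier method gives for minimizing $(s^{\mathrm{l}}_N - s') \cdot F_s (s^{\mathrm{l}}_N - s')$ over the half space bounded by the tangent plane at $e_s = s/\|s\|$; it is offered as an assumption to be checked against the original equation, and the function name and argument layout are hypothetical.

```python
import math


def approx_ml_estimate(s_lin, s_true, F_inv):
    """Project a linear estimate onto the half space {s' : e_s . s' <= 1} in the F_s metric.

    s_lin  : linear estimate (3 numbers)
    s_true : true Bloch vector, used only to fix the tangent plane (must be nonzero)
    F_inv  : inverse Fisher matrix as a symmetric 3x3 nested list
    """
    norm = math.sqrt(sum(c * c for c in s_true))
    if norm == 0.0:
        raise ValueError("the tangent plane is undefined for s = 0")
    e = [c / norm for c in s_true]
    excess = sum(e_a * x_a for e_a, x_a in zip(e, s_lin)) - 1.0
    if excess <= 0.0:
        return tuple(s_lin)  # already inside the approximated physical region
    F_inv_e = [sum(F_inv[a][b] * e[b] for b in range(3)) for a in range(3)]
    scale = excess / sum(e_a * g_a for e_a, g_a in zip(e, F_inv_e))
    # Move along F^{-1} e_s until the tangent plane is reached.
    return tuple(x_a - scale * g_a for x_a, g_a in zip(s_lin, F_inv_e))
```

An estimate inside the half space is returned unchanged, while an estimate outside it is moved onto the tangent plane along the direction $F_s^{-1} e_s$ rather than along the Euclidean normal.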



C. Expected squared Hilbert-Schmidt distance 

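From a straightforward calculation using formulas for 
Gaussian integrals, we can derive the approximate expected 
squared Hilbert-Schmidt distance, 

$$\begin{aligned}
\bar{\Delta}^{\mathrm{HS}}_N(\tilde{s}^{\mathrm{ml}}|s) ={}& \frac{1}{4N}\left(\mathrm{tr}[F_s^{-1}] - \frac{1}{2}\frac{e_s \cdot F_s^{-2} e_s}{e_s \cdot F_s^{-1} e_s}\,\mathrm{erfc}\left[\sqrt{N/N^*}\right]\right) \\
&- \frac{1}{4}\frac{1 - \|s\|}{\sqrt{2\pi N}}\frac{e_s \cdot F_s^{-2} e_s}{(e_s \cdot F_s^{-1} e_s)^{3/2}}\,e^{-N/N^*} \\
&+ \frac{1}{8}(1 - \|s\|)^2\frac{e_s \cdot F_s^{-2} e_s}{(e_s \cdot F_s^{-1} e_s)^{2}}\,\mathrm{erfc}\left[\sqrt{N/N^*}\right],
\end{aligned} \qquad (25)$$

where 

$$\mathrm{erfc}[a] := \frac{2}{\sqrt{\pi}}\int_a^{\infty} dt\, e^{-t^2} \qquad (26)$$

is the complementary error function and 

$$N^* := \frac{2\, e_s \cdot F_s^{-1} e_s}{(1 - \|s\|)^2} \qquad (27)$$

is a typical scale for the number of trials. By using 
the Cramer-Rao inequality, Eq. (14), we can prove that 
$e_s \cdot F_s^{-1} e_s/N$ is the variance of the linear estimates 
in the $e_s$ direction of the Bloch sphere. When $N$ is sufficiently 
large, most of the distribution of linear estimates 
is included in $D_s$ and the effect of the boundary becomes 
negligible. Roughly speaking, this condition is represented 
as $e_s \cdot F_s^{-1} e_s/N \ll (1 - \|s\|)^2$, where the right 
hand side is the squared Euclidean distance between $s$ 
and $\partial D_s$. This can be rewritten as 

$$N \gg \frac{e_s \cdot F_s^{-1} e_s}{(1 - \|s\|)^2} = \frac{1}{2}N^*. \qquad (28)$$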



We interpret $N^*$ as a reasonable benchmark for judging 
whether most of the distribution of the linear estimates is 
included in the physical region or not. The factor of 2 in 
Eq. (27) comes from the Gaussian integration, though in 
defining $N^*$ it is fairly arbitrary as it makes precise what 
we mean by 'most' in the preceding sentence. Thus, in 
order to justify the use of the Cramer-Rao bound for evaluating 
the estimation error, the number of measurement 
trials, $N$, must be larger than $N^*$. 

TABLE I. List of the true Bloch vectors under consideration (in spherical coordinates), and numerical values of $N^*$ (rounded 
down, when possible). 

(r, θ, φ)            Panels               N*
(0.9, 0, 0)          (EIF-1)              114
(0.9, π/4, π/4)      (EIF-2)              417
(0.99, 0, 0)         (EIF-3)              1194
(0.99, π/4, π/4)     (EHS-1), (EIF-4)     37947
(1, π/4, π/4)        (EHS-2), Figure 4    ∞
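
The values of $N^*$ in Table I can be reproduced with a short calculation. The sketch below assumes the definition $N^* = 2\, e_s \cdot F_s^{-1} e_s/(1-\|s\|)^2$ together with the inverse Fisher matrix $F_s^{-1} = 3\,\mathrm{diag}(1 - s_1^2, 1 - s_2^2, 1 - s_3^2)$ of the six-outcome measurement along the Pauli axes used in Sec. IV, and the usual spherical-coordinate convention with $\theta$ measured from the third axis. The function name `n_star` is hypothetical.

```python
import math


def n_star(r, theta, phi):
    """N* = 2 e.F^{-1}e / (1 - r)^2 with F^{-1} = 3 diag(1 - s_a^2), valid for r < 1."""
    e = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    s = [r * c for c in e]
    e_Finv_e = sum(3.0 * (1.0 - s_a ** 2) * e_a ** 2 for s_a, e_a in zip(s, e))
    return 2.0 * e_Finv_e / (1.0 - r) ** 2


quarter = math.pi / 4.0
table = {
    "(0.9, 0, 0)": n_star(0.9, 0.0, 0.0),
    "(0.9, pi/4, pi/4)": n_star(0.9, quarter, quarter),
    "(0.99, 0, 0)": n_star(0.99, 0.0, 0.0),
    "(0.99, pi/4, pi/4)": n_star(0.99, quarter, quarter),
}
```

Up to floating-point rounding the four entries are 114, 417.75, 1194 and 37947.75, which agree with the tabulated values once rounded down. For a pure state the denominator vanishes and $N^*$ diverges, so the function is only meant for $r < 1$.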

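When $\|s\| < 1$, in the limit of $N \to \infty$, $\mathrm{erfc}[\sqrt{N/N^*}]$ 
decreases exponentially fast. This can be readily shown 
by using the asymptotic expansion [15], 

$$\mathrm{erfc}[a] \sim \frac{e^{-a^2}}{a\sqrt{\pi}}\left[1 + \sum_{m=1}^{\infty} (-1)^m \frac{1 \cdot 3 \cdots (2m - 1)}{(2a^2)^m}\right]. \qquad (29)$$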



Therefore we can see that the approximate expected 
squared Hilbert-Schmidt distance converges to the 
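Cramer-Rao bound. On the other hand, when $\|s\| = 1$, 
the terms proportional to $1 - \|s\|$ and $(1 - \|s\|)^2$ vanish and we obtain 

$$\bar{\Delta}^{\mathrm{HS}}_N(\tilde{s}^{\mathrm{ml}}|s) = \frac{1}{4N}\left(\mathrm{tr}[F_s^{-1}] - \frac{1}{2}\frac{e_s \cdot F_s^{-2} e_s}{e_s \cdot F_s^{-1} e_s}\right), \qquad (30)$$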



where we assumed that $F_s < \infty$ for a Bloch vector $s$ 
with $\|s\| = 1$. This is smaller than the Cramer-Rao 
bound, and this implies that when the true state is pure, 
a maximum likelihood estimator can break the Cramer- 
Rao bound even in the asymptotic region. 
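
The approximate expected squared Hilbert-Schmidt distance can be evaluated numerically. The sketch below is derived here by direct Gaussian integration under the two approximations (Gaussian linear estimates with covariance $F_s^{-1}/N$, and the tangent-plane boundary), so the grouping of terms may differ from the expression in the text and it should be read as an illustration rather than the authors' formula. With $A = e_s \cdot F_s^{-1} e_s$, $B = e_s \cdot F_s^{-2} e_s$, $d = 1 - \|s\|$ and $\sigma^2 = A/N$, it returns $\frac{1}{4}\{\mathrm{tr}[F_s^{-1}]/N - (B/A^2)[d\,\sigma e^{-N/N^*}/\sqrt{2\pi} + \frac{1}{2}(\sigma^2 - d^2)\,\mathrm{erfc}\sqrt{N/N^*}]\}$. The function name is hypothetical.

```python
import math


def approx_expected_hs(s, F_inv, n_trials):
    """Approximate expected squared HS distance under the Gaussian and tangent-plane approximations.

    s        : true Bloch vector (nonzero)
    F_inv    : inverse Fisher matrix, symmetric 3x3 nested list
    n_trials : number of measurement trials N
    """
    r = math.sqrt(sum(c * c for c in s))
    e = [c / r for c in s]
    d = 1.0 - r                                            # distance from s to the tangent plane
    g = [sum(F_inv[a][b] * e[b] for b in range(3)) for a in range(3)]  # F^{-1} e_s
    A = sum(e_a * g_a for e_a, g_a in zip(e, g))           # e_s . F^{-1} e_s
    B = sum(g_a * g_a for g_a in g)                        # e_s . F^{-2} e_s
    trace = sum(F_inv[a][a] for a in range(3))
    x = n_trials * d * d / (2.0 * A)                       # N / N*
    sigma2 = A / n_trials                                  # variance along e_s
    boundary = (B / A ** 2) * (
        d * math.sqrt(sigma2 / (2.0 * math.pi)) * math.exp(-x)
        + 0.5 * (sigma2 - d * d) * math.erfc(math.sqrt(x))
    )
    return 0.25 * (trace / n_trials - boundary)
```

For a mixed state and large N the boundary term is exponentially small and the value approaches $\mathrm{tr}[F_s^{-1}]/(4N)$; for a pure state the value stays below that bound for every N.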



D. Expected infidelity 

In order to analyze the expected infidelity, we take the 
Taylor expansion of the infidelity around the true Bloch 
vector s up to the second order. The explicit form is in 
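Eq. (18). Again, using formulas for Gaussian integrals 
we can derive the approximate expected infidelity. When 
$\|s\| < 1$, 

$$\begin{aligned}
\bar{\Delta}^{\mathrm{IF}}_N(\tilde{s}^{\mathrm{ml}}|s) ={}& \frac{1}{4N}\left(\mathrm{tr}[F_s^{-1}] + \frac{s \cdot F_s^{-1} s}{1 - \|s\|^2}\right) \\
&- \frac{1}{4}\left(\frac{\mathrm{tr}[F_s^{-1}] - \mathrm{tr}[(Q_s F_s Q_s)^{-}]}{e_s \cdot F_s^{-1} e_s} + \frac{\|s\|^2}{1 - \|s\|^2}\right) \\
&\quad \times \left\{(1 - \|s\|)\sqrt{\frac{e_s \cdot F_s^{-1} e_s}{2\pi N}}\,e^{-N/N^*} + \frac{1}{2}\left(\frac{e_s \cdot F_s^{-1} e_s}{N} - (1 - \|s\|)^2\right)\mathrm{erfc}\left[\sqrt{N/N^*}\right]\right\},
\end{aligned} \qquad (31)$$

where 

$$Q_s := I - e_s e_s^T \qquad (32)$$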



is the projection matrix onto the subspace orthogonal to 
$s$, and $A^{-}$ is the Moore-Penrose generalized inverse of a 
matrix $A$. From the argument above, we can see that the 
approximate expected infidelity converges to the Cramer-Rao 
bound in the limit of large $N$. 

When $\|s\| = 1$, the infidelity is a 1st order function 
of $s$, given by $\Delta^{\mathrm{IF}}(s, s') = \frac{1}{2}(1 - s \cdot s')$, and there are 
no 2nd-order terms. Consequently, the Hesse matrix of 
the infidelity diverges at $\|s\| = 1$. Therefore we 
cannot apply the Cramer-Rao inequality to the infidelity 
for pure states. By calculating the expectation value of 
the approximate estimator $\tilde{s}^{\mathrm{ml}}_N$, we can obtain 

$$\bar{\Delta}^{\mathrm{IF}}_N(\tilde{s}^{\mathrm{ml}}|s) = \frac{1}{2}\sqrt{\frac{1}{2\pi}\frac{s \cdot F_s^{-1} s}{N}}. \qquad (33)$$



IV. NUMERICAL ANALYSIS 

We performed Monte Carlo simulations of one-qubit 
state tomography using three orthogonal projective 
(XYZ) measurements. Our task is to estimate the den- 
sity matrix of the one-qubit system, where the true state 
can be pure or mixed. We chose a maximum likelihood 
estimator, and we used a Newton-Raphson method to 
solve the (log-)likelihood equation with the completely 
mixed state $s = 0$ as the initial point of the iteration. 
When the procedure returned a candidate point outside 
of the Bloch sphere, we chose the previous point (within 
the sphere) as the estimate. 
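
The data-generation part of such a simulation is easy to sketch. The code below does not implement the Newton-Raphson maximum likelihood estimator described above; it only simulates outcome counts and estimates how often the linear estimate falls outside the Bloch ball, which is the situation in which the boundary matters. It assumes, as an illustration, that each of the three axes is measured the same fixed number of times; the function names, the seed and the sample sizes are hypothetical.

```python
import random


def simulate_linear_estimate(s, trials_per_axis, rng):
    """Simulate projective measurements along each Pauli axis and return the linear estimate."""
    estimate = []
    for s_a in s:
        p_plus = 0.5 * (1.0 + s_a)  # probability of the +1 outcome on this axis
        n_plus = sum(1 for _ in range(trials_per_axis) if rng.random() < p_plus)
        estimate.append((2 * n_plus - trials_per_axis) / trials_per_axis)
    return tuple(estimate)


def unphysical_fraction(s, trials_per_axis, n_sequences, seed=0):
    """Fraction of simulated data sets whose linear estimate has norm larger than one."""
    rng = random.Random(seed)
    count = 0
    for _ in range(n_sequences):
        s_lin = simulate_linear_estimate(s, trials_per_axis, rng)
        if sum(c * c for c in s_lin) > 1.0:
            count += 1
    return count / n_sequences


s_true = (0.0, 0.0, 0.9)
frac_small_n = unphysical_fraction(s_true, 10, 500)     # N = 30 trials in total
frac_large_n = unphysical_fraction(s_true, 1000, 200)   # N = 3000 trials in total
```

For this true state a large share of the linear estimates is unphysical at N = 30, while essentially none is at N = 3000; a full reproduction of the numerical experiment would replace the linear estimate by a constrained maximum likelihood estimate and average the loss over the simulated data sets.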

The POVM corresponding to three orthogonal projec- 
tive measurements is given by 



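$$\Pi = \left\{\frac{1}{3}|\alpha+\rangle\langle\alpha+|,\ \frac{1}{3}|\alpha-\rangle\langle\alpha-|\right\}_{\alpha = 1, 2, 3}, \qquad (34)$$

where $|\alpha\pm\rangle$ are the eigenstates of $\sigma_\alpha$ with eigenvalue $\pm 1$. 
The Fisher matrix and its inverse are given by 

$$F_s = \frac{1}{3}\begin{pmatrix} \frac{1}{1 - s_1^2} & 0 & 0 \\ 0 & \frac{1}{1 - s_2^2} & 0 \\ 0 & 0 & \frac{1}{1 - s_3^2} \end{pmatrix}, \quad F_s^{-1} = 3\begin{pmatrix} 1 - s_1^2 & 0 & 0 \\ 0 & 1 - s_2^2 & 0 \\ 0 & 0 & 1 - s_3^2 \end{pmatrix}. \qquad (35)$$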



FIG. 1. Bloch radius dependency of $N^*$ for standard quantum 
state tomography, given in Eq. (36). The solid line is for 
states $s$ given by $(r, 0, 0)$, and the dashed line is for those 
given by $(r, \pi/4, \pi/4)$. (Horizontal axis: Bloch radius, $r$.) 

In Secs. IV A and IV B, we show the plots for two 
loss functions: the squared Hilbert-Schmidt distance $\Delta^{\mathrm{HS}}$ 
and the infidelity $\Delta^{\mathrm{IF}}$. The pointwise expected losses 
$\bar{\Delta}_N(s^{\mathrm{ml}}|s)$ and the approximated functions $\bar{\Delta}_N(\tilde{s}^{\mathrm{ml}}|s)$ 
introduced in Sec. III are compared, and the accuracy of 
those approximations is discussed. Table I is a list of 
true Bloch vectors $s$ for the figures shown in the following 
subsections, along with the numerical values of $N^*$ 
for each $s$. We chose two Bloch radii, $r = 0.9, 0.99$, and 
two sets of angles $(\theta, \phi) = (0, 0), (\pi/4, \pi/4)$ as the true 
Bloch vector $s$. For a fixed $r$, the case with angles $(0, 0)$ 
corresponds to one of the best case scenarios because 
the Bloch vector is along one measurement axis, while 
the $(\pi/4, \pi/4)$ case corresponds to one of the worst case scenarios 
because the Bloch vector is equidistant from all the measurement 
axes. The explicit form of $N^*$ for the Fisher 
matrix in Eq. (35) is 



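$$N^* = 6\left[\frac{1 + \|s\|}{1 - \|s\|} + \frac{2\left\{(s_1 s_2)^2 + (s_2 s_3)^2 + (s_3 s_1)^2\right\}}{\|s\|^2 (1 - \|s\|)^2}\right]. \qquad (36)$$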



There are two terms which contribute to the divergence 
at $\|s\| = 1$, and near this value the first term behaves as 
$O((1 - \|s\|)^{-1})$ while the second does as $O((1 - \|s\|)^{-2})$. 
When the true Bloch vector is along one of the measurement 
axes, the second term in Eq. (36) disappears. For 
example, if $s = (r, 0, 0)$, we obtain $N^* = 6(1 + r)/(1 - r) \approx 12/(1 - r)$ 
as $r \to 1$. On the other hand, when the true Bloch vector 
does not lie along any measurement axis, the second 
term remains. For example, if $s = (r, \pi/4, \pi/4)$, we obtain 
$N^* = 6\left[(1 + r)/(1 - r) + 5r^2/\{8(1 - r)^2\}\right]$. Therefore 
$N^*$ for a true Bloch vector whose direction is along one 
of the measurement axes becomes smaller than that for a 
true Bloch vector whose direction is not. This difference 
caused by the alignment of measurement axes becomes 
larger as the purity of $\rho(s)$ becomes higher. 

The terms caused by the boundary in Eqs. (25), (30), 
(31), (33) start to decrease exponentially fast after $N$ becomes 
larger than $N^*$. We expect that the simulated and 
approximated plots start to converge to the Cramer-Rao 
bound after $N$ becomes larger than $N^*$. In all figures, 
the line styles are as follows: a solid (black) line for the 
numerically simulated expected loss $\bar{\Delta}_N(s^{\mathrm{ml}}|s)$, a dashed 
(red) line for the approximate expected loss $\bar{\Delta}_N(\tilde{s}^{\mathrm{ml}}|s)$ 
given in Eqs. (25), (30), (31), (33), a chain (green) line 
for the Cramer-Rao bound, and a dotted (black) vertical 
line for $N^*$. 



A. Expected squared Hilbert-Schmidt distance 

The Cramer-Rao bound of the expected squared 
Hilbert-Schmidt distance is given by 



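$$\frac{1}{N}\mathrm{tr}\left[H^{\mathrm{HS}}_s F_s^{-1}\right] = \frac{1}{N}\,\frac{3}{4}\left(3 - \|s\|^2\right). \qquad (37)$$
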

Figure 2 shows the pointwise expected squared Hilbert-Schmidt 
distance $\bar{\Delta}^{\mathrm{HS}}_N$ plotted against the number of trials 
$N$ (the horizontal and vertical axes are both logarithmic 
scale). The panels (EHS-1) and (EHS-2) are for the 
true Bloch vector $s$ given by $(r, \theta, \phi) = (0.99, \pi/4, \pi/4)$ 
and $(r, \theta, \phi) = (1, \pi/4, \pi/4)$, respectively, so that the former 
is (slightly) mixed, while the latter is pure. The 
panel (EHS-1) shows that our approximation in Eq. (25) 
converges to the simulated plot, and both the simulated 
and approximated plots converge to the Cramer-Rao 
bound of Eq. (37) as $N$ becomes large. The same 
behavior is observed for other mixed true states. On the 
other hand, panel (EHS-2) shows a different behavior; 
our approximation in Eq. (30) converges to the simulated 
plot, but the simulated and approximated plots do not 
converge to the Cramer-Rao bound. This indicates that 
for pure states, our approximation better captures the 
behavior of the expected loss than does the Cramer-Rao 
bound. As mentioned around Eq. (30), the reason for 
this is that the center of the distribution of the linear es- 
timates for a pure state will always be on the boundary of 
the Bloch sphere, so that about a half of the distribution 
will always be in the unphysical region. This prohibits 
a maximum likelihood estimator from ever converging to 
the Cramer-Rao bound. 



B. Expected infidelity 

The infidelity is a nonlinear function of the states, and 
we must approximate the Cramer-Rao bound in this case; 
doing so up to second order gives 



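$$\frac{1}{N}\mathrm{tr}\left[H^{\mathrm{IF}}_s F_s^{-1}\right] = \frac{3}{4N}\left[3 + \frac{2\left\{(s_1 s_2)^2 + (s_2 s_3)^2 + (s_3 s_1)^2\right\}}{1 - \|s\|^2}\right]. \qquad (38)$$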



Figure 3 shows the pointwise expected infidelity 
$\bar{\Delta}^{\mathrm{IF}}_N$ plotted against the number of measurement 
trials $N$: (EIF-1), (EIF-2), (EIF-3), (EIF-4) 
are for the true Bloch vector $s$ given by 
$(r, \theta, \phi) = (0.9, 0, 0)$, $(0.9, \pi/4, \pi/4)$, $(0.99, 0, 0)$, and 
$(0.99, \pi/4, \pi/4)$, respectively. Thus panels (EIF-1,2) and 
panels (EIF-3,4) are for true states with the same 
purity. Panels (EIF-1) and (EIF-3) are for the case that one of the measurement axes coincides with the 
direction of the true Bloch vector, while panels (EIF-2) 
and (EIF-4) are for the case that all of the measurement 
axes are as far as possible from the true Bloch vector. 
Figure 3 shows that $N^*$ is a good benchmark for the 
number of trials required for the simulated plot to start 
to converge to the Cramer-Rao bound, and so we can 
say that in order to justify the use of the asymptotic 
theory, $N$ must be larger than $N^*$. Figure 3 indicates 
that the angle dependency of the expected infidelity 
becomes larger as the purity becomes higher. When 
the true state is far from all measurement axes, the 
accuracy of our approximation is higher than that of the 
Cramer-Rao bound. For $N$ smaller than about 10 000 
(the 'low $N$ region'), the accuracy of our approximation 
is low (though still higher than that of the Cramer-Rao 
bound). We believe that the main reason for our 
approximation's poor performance in this low $N$ region 
is the second order approximation of the infidelity, and 
that higher orders would improve the accuracy here. 
However, in the high $N$ region the approximation can 
be seen to capture the behavior of the curve far better 
than the Cramer-Rao bound. 

FIG. 2. Pointwise expected squared Hilbert-Schmidt distance $\bar{\Delta}^{\mathrm{HS}}_N$ plotted against the number of measurement trials $N$: (EHS-1) 
and (EHS-2) are for the true Bloch vector $s$ given by $(r, \theta, \phi) = (0.99, \pi/4, \pi/4)$ and $(1, \pi/4, \pi/4)$, respectively. The number 
of sequences used for the calculation of the statistical expectation values is 10 000. 

FIG. 3. Pointwise expected infidelity $\bar{\Delta}^{\mathrm{IF}}_N$ plotted against the number of measurement trials $N$: (EIF-1), (EIF-2), (EIF-3), (EIF-4) 
are for the true Bloch vector $s$ given by $(r, \theta, \phi) = (0.9, 0, 0)$, $(0.9, \pi/4, \pi/4)$, $(0.99, 0, 0)$, and $(0.99, \pi/4, \pi/4)$, respectively. 
The number of sequences used for the calculation of the statistical expectation values is 10 000. 

FIG. 4. Pointwise expected infidelity $\bar{\Delta}^{\mathrm{IF}}_N$ plotted against the 
number of measurement trials $N$ for the true Bloch vector $s$ 
given by $(r, \theta, \phi) = (1, \pi/4, \pi/4)$. The number of sequences 
used for the calculation of statistical expectation values is 
10 000. 

Figure 4 shows the pointwise expected infidelity $\bar{\Delta}^{\mathrm{IF}}_N$ 
against the number of measurement trials $N$ for the true 
Bloch vector $s$ given by $(r, \theta, \phi) = (1, \pi/4, \pi/4)$. For pure 
true states, the expected infidelity decreases as $O(1/\sqrt{N})$, 
and Fig. 4 shows that the expected infidelity converges 
to the approximate function. 
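
The $1/\sqrt{N}$ scaling for pure states can be checked with a small calculation. The sketch below assumes that, under the Gaussian and tangent-plane approximations, the approximate expected infidelity of a pure true state is $\frac{1}{2}\sqrt{s \cdot F_s^{-1} s/(2\pi N)}$, which follows from the infidelity $\frac{1}{2}(1 - s \cdot s')$ being linear in the component of the estimate along $s$; this closed form is derived here and should be checked against the original equation. The inverse Fisher matrix is taken to be $3\,\mathrm{diag}(1 - s_a^2)$, and the function names are hypothetical.

```python
import math


def xyz_fisher_inverse(s):
    """Inverse Fisher matrix 3 diag(1 - s_a^2) of the six-outcome Pauli-axis measurement."""
    return [[3.0 * (1.0 - s[a] ** 2) if a == b else 0.0 for b in range(3)] for a in range(3)]


def approx_pure_state_infidelity(s, F_inv, n_trials):
    """(1/2) sqrt(s . F^{-1} s / (2 pi N)) for a unit Bloch vector s."""
    quad = sum(s[a] * F_inv[a][b] * s[b] for a in range(3) for b in range(3))
    return 0.5 * math.sqrt(quad / (2.0 * math.pi * n_trials))


# Pure state with (r, theta, phi) = (1, pi/4, pi/4), i.e. s = (1/2, 1/2, 1/sqrt(2)).
s_pure = (0.5, 0.5, math.sqrt(0.5))
F_inv = xyz_fisher_inverse(s_pure)
loss_n = approx_pure_state_infidelity(s_pure, F_inv, 10_000)
loss_4n = approx_pure_state_infidelity(s_pure, F_inv, 40_000)
```

Quadrupling the number of trials halves the approximate expected infidelity, in contrast to the $1/N$ decrease for mixed states in the asymptotic regime.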

V. CONCLUSIONS 

In this paper, we analyzed expected losses in one- 
qubit state tomography for finite data sets. We de- 
rived an explicit formula of the expected squared Hilbert- 
Schmidt distance and the expected infidelity between a 
tomographic maximum likelihood estimate and the true 
state under two approximations: a Gaussian distribution 
matched to the moments of the asymptotic multinomial 
distribution, and a linearization of the parameter space 
boundary imposed by the positivity of quantum states. 
We performed Monte Carlo simulations of one-qubit state 
tomography and evaluated the accuracy of the approx- 
imation formulas by comparing them to the numerical 
results. The numerical comparison shows that our ap- 
proximation reproduces the behavior in the nonasymp- 
totic regime much better than the asymptotic theory, and 
the typical number of measurement trials derived from 
the approximation is a reasonable threshold after which 
the expected loss starts to converge to the asymptotic 
behavior. 